tests: cross-driver regression matrix harness#32

Merged
josephnef merged 1 commit into master from feat/regress-test-rig
May 23, 2026

Conversation

@josephnef
Collaborator

What this is

A manual-run Python orchestrator that compares devourer's userspace stack against the kernel driver (mainline rtw88 / out-of-tree aircrack-ng/rtl8812au) on a host with two plugged-in USB Wi-Fi adapters. Emits a markdown table — designed to paste into PR review comments.

                  TX = devourer        TX = kernel
RX = devourer     [end-to-end dvr]     [does dvr RX a kernel-TX frame?]
RX = kernel       [does dvr emit       [baseline / rig sanity]
                   valid frames?]

Each cell injects/receives the canonical beacon (SA 57:42:75:05:d6:00, matching txdemo/main.cpp) for --duration seconds and counts hits.
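
As a rough illustration of how four cell results could be scored and turned into the pasted table, here is a minimal sketch. `CellResult` and `render_matrix` are hypothetical names invented for this example and are not taken from tests/regress.py; the pass rule (hits at or above a threshold) follows the `--pass-threshold` wording in the commit message.

```python
from dataclasses import dataclass


@dataclass
class CellResult:
    hits: int        # beacons the RX side matched on the canonical SA
    tx: int          # frames the TX side reported sending
    duration_s: int  # how long the cell ran

    def render(self, pass_threshold: int) -> str:
        mark = "✓" if self.hits >= pass_threshold else "✗"
        return f"{self.hits} hits / {self.tx} TX / {self.duration_s}s {mark}"


def render_matrix(results: dict, pass_threshold: int = 1) -> str:
    """results maps (rx_driver, tx_driver) -> CellResult."""
    drivers = ("devourer", "kernel")
    lines = [
        "|   | " + " | ".join(f"TX = {d}" for d in drivers) + " |",
        "|---|---|---|",
    ]
    for rx in drivers:
        cells = [results[(rx, tx)].render(pass_threshold) for tx in drivers]
        lines.append(f"| RX = {rx} | " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Keying the results by (RX driver, TX driver) keeps the table layout independent of the order in which the cells actually ran.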

Why now

PRs like #30 (RTL8821AU partial bring-up) need cross-driver validation: "does devourer's TX really emit valid frames?" and "can devourer RX a frame the kernel driver knows works?". Running these checks manually is fiddly (modprobe / unbind / iw / tcpdump dance per cell); this script does it in one command and prints a structured result.

This is not a 24x7 CI runner — too few PRs to justify the infrastructure. It's a script the reviewer runs on demand on a test rig.

Usage

cd /path/to/devourer && cmake --build build -j
sudo python3 tests/regress.py --channel 100

See tests/README.md for full options + prereqs.

First-run validation on trainer-arch

Arch Linux, kernel 6.x, USB hub with 0bda:8812 (8812AU) + 0bda:8813 (8814AU):

## Regression matrix — channel 100
- TX adapter: 0bda:8813 (RTL8814AU)
- RX adapter: 0bda:8812 (RTL8812AU)

|   | TX = devourer | TX = kernel |
|---|---|---|
| RX = devourer | 0 hits / 10 TX (437 fail) / 10s ✗ | 0 hits / 0 TX / 0s ✗ |
| RX = kernel | 1 hits / 10 TX (351 fail) / 10s ✓ | 0 hits / 0 TX / 0s ✗ |

The devourer-TX(8814) → kernel-RX(8812) cell passed — independent confirmation that #29's 8814AU TX bring-up really does land frames on the air. The remaining cells correctly identified the rig's known limitations: mainline rtw88_8814au can't probe this 8814AU dongle on this kernel (failed to download firmware, probe error -22), and 8814AU RX is a pre-existing TODO.

Portability

  • Tool paths resolved via which (no /usr/bin/X hardcoding)
  • Wlan iface names discovered via iw dev (works for systemd wlp* and classic wlan*)
  • Kernel driver claiming each DUT read from sysfs (no hardcoded module names)
  • Preflight check prints distro-agnostic install hints if anything's missing
  • Tested on Arch; should work on any modern Linux with iw, tcpdump, python3-scapy, aircrack-ng
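
The first and third bullets can be sketched roughly as follows. This is an illustration under assumptions, not the script's actual code: `missing_tools` and `bound_driver` are hypothetical helper names, and the sysfs directory passed in is whatever USB interface directory the caller has already located.

```python
import os
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH (no hardcoded /usr/bin)."""
    return [tool for tool in tools if which(tool) is None]


def bound_driver(sysfs_iface_dir):
    """Return the kernel driver bound to a sysfs device directory, or None.

    sysfs exposes the binding as a `driver` symlink; the basename of its
    target is the driver name, so no module names need to be hardcoded.
    """
    link = os.path.join(sysfs_iface_dir, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))
```

A preflight step could call `missing_tools(["iw", "tcpdump"])` and print install hints for whatever comes back.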

VM-readiness

The kernel-cell shell-outs all go through one function (run_kernel_cmd). Today: local exec. To migrate the kernel driver into a pinned-kernel VM (recommended once host kernel upgrades start breaking the out-of-tree aircrack-ng driver), wrap that function with ssh trainer-vm sudo and arrange USB hot-plug passthrough via libvirt. The matrix orchestrator doesn't need to change.
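
To make the single-choke-point idea concrete, here is a minimal sketch of what such a wrapper might look like. The signature is an assumption (the real `run_kernel_cmd` is not shown in this PR), and `kernel_argv` is a hypothetical helper added only so the argv construction can be inspected without running anything.

```python
import shlex
import subprocess


def kernel_argv(cmd, ssh_target=None):
    """Build the argv for a kernel-side command.

    Local mode returns the command unchanged. VM mode prefixes it with
    `ssh <target> sudo` and quotes the command for the remote shell.
    """
    if ssh_target is None:
        return list(cmd)
    return ["ssh", ssh_target, "sudo", shlex.join(cmd)]


def run_kernel_cmd(cmd, ssh_target=None, **kwargs):
    """Run a kernel-side command locally or through ssh (assumed signature)."""
    return subprocess.run(
        kernel_argv(cmd, ssh_target), capture_output=True, text=True, **kwargs
    )
```

Because every caller goes through the same function, switching to the VM only changes the value of `ssh_target`.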

Known limitations (documented in README)

  • Tests "signal of life", not throughput — air noise makes absolute counts unreliable; default pass-threshold is 1 hit with guidance to bump for higher-confidence runs.
  • Sequential matrix takes ~100s for 4 cells (devourer fwdl warmup + 4 × ~25s).
  • Two-adapter scope today. Extending to >2 is a pairing loop in main().
  • One known bug: the <devourer-tx>TX #N prints are rate-limited, so when the chip is failing every send the parser undercounts attempts. Mitigated by surfacing the failure count separately in the output.
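
The mitigation in the last bullet could look something like the sketch below. The exact log line formats are assumptions: the regex expects a `<devourer-tx>TX #N` marker, and the failure string "Failed to send packet" is the one named in the follow-up commit. `count_tx` is a hypothetical name.

```python
import re

# Assumed shape of the rate-limited progress print.
TX_PRINT = re.compile(r"<devourer-tx>\s*TX #(\d+)")
# Failure message counted separately from the progress prints.
TX_FAIL = "Failed to send packet"


def count_tx(log_text):
    """Return (progress_prints, failures) seen in a devourer TX log."""
    printed = 0
    failed = 0
    for line in log_text.splitlines():
        if TX_PRINT.search(line):
            printed += 1
        elif TX_FAIL in line:
            failed += 1
    return printed, failed
```

Reporting the two numbers side by side lets a reader see that a low TX count with a high failure count points at the send path rather than at the parser.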

Test plan

  • Builds + runs on trainer-arch (Arch + kernel 6.x)
  • Markdown table emitted correctly
  • At least one cell passes against real hardware (8814 dvr-TX → 8812 kernel-RX)
  • Validate on a different distro (Ubuntu / Fedora) — anyone with a 2-adapter rig
  • Validate against the out-of-tree aircrack-ng/rtl8812au driver instead of mainline rtw88

🤖 Generated with Claude Code

Adds `tests/regress.py` — a manual-run Python orchestrator that compares
this project's userspace stack against the kernel-driver baseline for
both TX and RX on two plugged-in USB Wi-Fi adapters:

                  TX = devourer        TX = kernel
RX = devourer     end-to-end devourer  does dvr RX a kernel-TX frame?
RX = kernel       does dvr emit valid  baseline / rig sanity check
                  frames?

Each cell injects/receives the canonical beacon (SA 57:42:75:05:d6:00,
matching txdemo/main.cpp) for --duration seconds and counts hits. A
cell passes if hits >= --pass-threshold. Output is a markdown table —
designed to paste into PR comments.

Pieces:
- `tests/regress.py` — matrix orchestrator. Auto-detects DUTs via
  sysfs, handles per-cell kernel-driver bind/unbind, parses devourer
  log output, supports --no-baseline-abort for partial-rig setups
  where one chipset has no working kernel driver.
- `tests/inject_beacon.py` — standalone scapy injector for the
  kernel-TX cells. Emits the same beacon WiFiDriverTxDemo uses, so
  cross-driver SA matching works either direction.
- `tests/README.md` — usage, prereqs, distro-agnostic install hints,
  VM-readiness notes (kernel-cell shell-outs all go through one
  function — drop in `ssh trainer-vm sudo` to migrate the kernel
  driver into a pinned-kernel VM when host upgrades start breaking
  the out-of-tree aircrack-ng driver).

Portability: tool paths resolved via `which`, wlan iface names
discovered via `iw dev` (works for systemd's `wlp*` and classic
`wlan*`), kernel driver claiming each DUT read from sysfs (no
hardcoded module names). Preflight check prints actionable install
hints if anything's missing.

First-run validation on trainer-arch (Arch, kernel 6.x, 0bda:8812 +
0bda:8813 in a USB hub): the devourer-TX(8814) → kernel-RX(8812)
cell passed, proving devourer's RTL8814AU TX path (per #29) really
does emit frames the mainline rtw88 picks up. The remaining cells
correctly identified the rig's known limitations — mainline
rtw88_8814au can't probe this 8814AU dongle on this kernel (firmware-
download error -22), and 8814AU RX is a pre-existing TODO. README
explains how to interpret a partial matrix in that case.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@josephnef josephnef merged commit 63e737d into master May 23, 2026
5 checks passed
@josephnef josephnef deleted the feat/regress-test-rig branch May 23, 2026 09:18
josephnef added a commit that referenced this pull request May 23, 2026
…el) (#33)

## What this is

Adds a libvirt-VM execution mode to `tests/regress.py` so the
kernel-side cells of the regression matrix can run against the
`aircrack-ng/rtl8812au` out-of-tree driver on a **pinned kernel**,
instead of fighting the host kernel.

## Why a VM

The OOT `aircrack-ng/rtl8812au` driver lags kernel API changes by 6-12
months (timer_*, cfg80211 callback signatures with MLO link_id, etc.).
On kernel 6.15+ it needs hand-patching to build. morrownr's README flags
that mainline `rtw88_*` is now the recommended path from kernel 6.14
onwards — but **mainline `rtw88_8814au` currently fails to probe**
RTL8814AU on this lab's adapter (`failed to download firmware`, `error
-22`). So for 8814 specifically, OOT aircrack-ng is the only working
kernel-side path.

Pinning a VM to Ubuntu 22.04 LTS (kernel 5.15) gives a stable platform
where aircrack-ng's driver builds and loads cleanly. The host can
upgrade freely without breaking the test rig.

## Pieces

**`tests/setup_vm.sh`** — one-shot VM provisioner. Clones an Ubuntu
22.04 cloud image (`jammy-base.qcow2`), generates a cloud-init seed
(creates `dima` user with caller's SSH key, NOPASSWD sudo, installs
build-essential / dkms / linux-headers / iw / tcpdump / python3-scapy /
aircrack-ng), `virt-install`s with `qemu-xhci` USB controller for
hot-plug, runs `make dkms_install` of `aircrack-ng/rtl8812au` inside via
`runcmd`. ~5-10 min end to end. `--teardown` and `--status` subcommands
included.

**`tests/regress.py` refactor** — introduces a `KernelHost` abstraction
owning every kernel-side operation (`modprobe`, sysfs reads, `iw`,
`tcpdump`, scapy). Local mode = `subprocess.run`. VM mode = `ssh ...
sudo` + `virsh attach-device`/`detach-device` for per-cell USB
passthrough. New CLI flags `--vm-name` / `--vm-ssh` (env:
`DEVOURER_VM_NAME`, `DEVOURER_VM_SSH`). When invoked under `sudo`, picks
up `SUDO_USER`'s SSH key — root usually doesn't have keys provisioned on
the VM.
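
A minimal sketch of how the two flags and their environment fallbacks might be wired up is shown here. The precedence (flag over environment) and the rule that VM mode needs both values are assumptions made for the example; `parse_vm_args` is a hypothetical name, not a function from `tests/regress.py`.

```python
import argparse
import os


def parse_vm_args(argv, env=os.environ):
    """Parse --vm-name / --vm-ssh, falling back to DEVOURER_VM_* variables."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--vm-name", default=env.get("DEVOURER_VM_NAME"))
    parser.add_argument("--vm-ssh", default=env.get("DEVOURER_VM_SSH"))
    args = parser.parse_args(argv)
    # Assumption: VM mode is only enabled when both values are known.
    args.vm_mode = bool(args.vm_name and args.vm_ssh)
    return args
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.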

**Per-cell DUT routing** — each cell calls `_ensure_dut_location` for
each DUT, which (in VM mode) moves the DUT between host and VM via virsh
as needed. State always restored to "both DUTs on host" between cells
via try/finally so a crashed cell doesn't poison the next one. Script
start has a `release_all_known_duts` pass for leftover-attached DUTs
from previous aborted runs.
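
The restore-on-exit behaviour described above can be sketched with a context manager. This is an illustration only: `duts_routed` is a hypothetical name, and `move` stands in for whatever wraps `virsh attach-device` / `detach-device` in the real script.

```python
from contextlib import contextmanager


@contextmanager
def duts_routed(duts, wanted, move):
    """Place each DUT where the cell needs it, then return all DUTs to the host.

    duts   : list of DUT identifiers
    wanted : dict mapping DUT -> "host" or "vm"
    move   : callable(dut, location) that performs the actual relocation
    """
    try:
        for dut in duts:
            move(dut, wanted[dut])
        yield
    finally:
        # Runs even if the cell raised, so the next cell starts from a known state.
        for dut in duts:
            move(dut, "host")
```

A cell body would then run inside `with duts_routed(...):`, and an exception in the cell still leaves both DUTs back on the host.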

## Validation on trainer-arch

Arch Linux host kernel 6.18, VM Ubuntu 22.04 LTS kernel 5.15, two USB
DUTs in a hub (0bda:8812 RTL8812AU + 0bda:8813 RTL8814AU):

```
## Regression matrix — channel 100, 2026-05-23 13:22:14
- TX adapter: 0bda:8812 (RTL8812AU)
- RX adapter: 0bda:8813 (RTL8814AU)
- Kernel host: VM devourer-testrig via dima@10.216.129.126
- Cell duration: 10s
- Pass threshold: ≥ 3 hits

|   | TX = devourer | TX = kernel |
|---|---|---|
| RX = devourer | 0 hits / 4500 TX ✗ | 0 hits / 258 TX ✗ |
| RX = kernel | 4172 hits / 4500 TX ✓ | 229 hits / 259 TX ✓ |
```

- **Baseline ✓** kernel-TX 8812 → kernel-RX 8814 inside VM, **~88%
delivery**
- **devourer-TX validation ✓** devourer-TX 8812 on host → kernel-RX 8814
in VM, **~93% delivery** — confirms devourer's RTL8812AU TX really emits
valid frames at the wire level
- The two failing cells are the pre-existing devourer 8814 RX TODO, not
regressions; cell 3's new "0 hits / 258 TX" output correctly fingers
the RX side (TX side really did emit 258 frames; devourer-RX 8814
silent)

For comparison: the same hardware in local mode from #32's first run got
**1 hit** on the devourer-TX→kernel-RX cell because mainline
`rtw88_8814au` couldn't probe the chip. The VM with aircrack-ng gives
**~4000× the signal**.
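
For reference, the delivery percentages and the signal ratio quoted above follow directly from the table. The short check below reproduces them; `delivery` is a hypothetical helper written for this note.

```python
def delivery(hits, tx):
    """Fraction of transmitted frames that the RX side counted as hits."""
    return hits / tx if tx else 0.0


baseline = delivery(229, 259)     # kernel-TX -> kernel-RX
cross = delivery(4172, 4500)      # devourer-TX -> kernel-RX
signal_ratio = 4172 / 1           # VM-mode hits vs the single hit in #32

print(f"baseline {baseline:.1%}, cross-driver {cross:.1%}, ratio {signal_ratio:.0f}x")
# prints: baseline 88.4%, cross-driver 92.7%, ratio 4172x
```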

## Smaller fixes folded in

- TX-count parser surfaces "Failed to send packet" failure count
separately from the rate-limited `<devourer-tx>` print count (previously
misleadingly low when sends were failing)
- `--no-baseline-abort` flag for partial-rig diagnostics
- `wait_for_wlan_iface` timeout bumped to 20s (kernel rebinds + VM
passthrough enumeration take 10s+)
- Kernel-TX cells `wait()` for `inject_beacon` to self-terminate instead
of killing the ssh wrapper, which captures the final "sent N frames" line
(previously TX count showed 0 even though RX side received frames)
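
The timeout change for `wait_for_wlan_iface` amounts to a poll-until-deadline loop. A generic sketch of that pattern follows; `wait_for` and its parameters are hypothetical, and the probe would in practice be something that looks for the interface (for example by parsing `iw dev` output).

```python
import time


def wait_for(probe, timeout_s=20.0, interval_s=0.5,
             clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns a truthy value or timeout_s elapses.

    Returns the probe's value on success, or None on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        result = probe()
        if result:
            return result
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

Using a monotonic clock avoids surprises if the wall clock is adjusted while a cell is waiting on USB enumeration.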

## Usage

```bash
sudo tests/setup_vm.sh                    # ~5-10 min, one-time
sudo tests/setup_vm.sh --status

sudo python3 tests/regress.py --channel 100 \
    --vm-name devourer-testrig \
    --vm-ssh dima@<VM-IP>
```

See [`tests/README.md`](tests/README.md) for full options, prereqs,
architecture notes.

## Known limitations (documented in README)

- VM mode assumes a single libvirt host running both `virsh` (locally)
and the VM. Pulling the VM onto a different host needs your own `virsh`
wrapper.
- Per matrix run: ~3-4 min in VM mode vs ~100s for local mode (USB
hot-plug adds ~5s per cell transition).
- Two-adapter scope today. >2 needs a pairing loop in `main()`.
- Cell 4 (`devourer-TX → devourer-RX`) needs both DUTs
devourer-claimable simultaneously — if one chipset has broken devourer
RX (current RTL8814AU TODO), that cell shows 0 regardless of TX.

## Test plan

- [x] VM provisioning succeeds end-to-end (`setup_vm.sh` clean run on
trainer-arch)
- [x] aircrack-ng/rtl8812au DKMS install works inside VM (kernel 5.15)
- [x] USB hot-plug of 8814AU into VM works (mainline rtw88 couldn't
probe; aircrack-ng claims cleanly)
- [x] Full 4-cell matrix runs end-to-end in VM mode
- [x] Baseline cell passes (rig sanity)
- [x] devourer-TX → kernel-RX cell passes (cross-driver validation)
- [x] Failing cells produce diagnostic output (TX count vs RX hits)
- [ ] Validate on a different distro / different VM base image
- [ ] Validate with a 2× same-chip DUT setup (both cells with
both-devourer pass)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>